An Open Source Tool for Partial Parsing and Morphosyntactic Disambiguation
Abstract
This article presents a formalism and an open source implementation of a new tool for simultaneous partial parsing and morphosyntactic disambiguation and correction. We argue that, contrary to the common pipeline approach, where morphosyntactic tagging is fully accomplished before shallow or partial parsing, both tasks are best approached in parallel. This has been suggested before, and formalisms which allow for the interweaving of partial parsing and morphosyntactic disambiguation have been proposed. Our approach is novel in that a fully uniform formalism is presented, and a single grammar rule may contain structure-building operations, as well as morphosyntactic correction and disambiguation operations. The formalism has been implemented in Java and is now available under the GNU General Public License.
Similar resources
♠ Demo: An Open Source Tool for Partial Parsing and Morphosyntactic Disambiguation
The paper presents Spejd, an Open Source Shallow Parsing and Disambiguation Engine. Spejd (abbreviated to ♠) is based on a fully uniform formalism both for constituency partial parsing and for morphosyntactic disambiguation — the same grammar rule may contain structure-building operations, as well as morphosyntactic correction and disambiguation operations. The formalism and the engine are more...
Spejd: A Shallow Processing and Morphological Disambiguation Tool
This article presents a formalism and a beta version of a new tool for simultaneous morphosyntactic disambiguation and shallow parsing. Unlike in the case of other shallow parsing formalisms, the rules of the grammar allow for explicit morphosyntactic disambiguation statements, independently of structure-building statements, which facilitates the task of the shallow parsing of morphosyntactical...
An Implementation of Combined Partial Parser and Morphosyntactic Disambiguator
The aim of this paper is to present a simple yet efficient implementation of a tool for simultaneous rule-based morphosyntactic tagging and partial parsing formalism. The parser is currently used for creating a treebank of partial parses in a valency acquisition project over the IPI PAN Corpus of Polish.
Verbal Morphosyntactic Disambiguation through Topological Field Recognition in German-Language Law Texts
The morphosyntactic disambiguation of verbs is a crucial pre-processing step for the syntactic analysis of morphologically rich languages like German and domains with complex clause structures like law texts. This paper explores how much linguistically motivated rules can contribute to the task. It introduces an incremental system of verbal morphosyntactic disambiguation that exploits the conce...
JoBimText Visualizer: A Graph-based Approach to Contextualizing Distributional Similarity
We introduce an interactive visualization component for the JoBimText project. JoBimText is an open source platform for large-scale distributional semantics based on graph representations. First we describe the underlying technology for computing a distributional thesaurus on words using bipartite graphs of words and context features, and contextualizing the list of semantically similar words t...
Publication date: 2007